Feature Identification Using Interpretability Machine Learning Predicting Risk Factors for Disease Severity of In-Patients with COVID-19 in South Florida

Objective: The objective of the study was to establish an AI-driven decision support system by identifying the most important features in the severity of disease for Intensive Care Unit (ICU) with Mechanical Ventilation (MV) requirement, ICU, and InterMediate Care Unit (IMCU) admission for hospitalized patients with COVID-19 in South Florida. The features implicated in the risk factors identified by the model interpretability can be used to forecast treatment plans faster before critical conditions exacerbate. Methods: We analyzed eHR data from 5371 patients diagnosed with COVID-19 from South Florida Memorial Healthcare Systems admitted between March 2020 and January 2021 to predict the need for ICU with MV, ICU, and IMCU admission. A Random Forest classifier was trained on patients’ data augmented by SMOTE, collected at hospital admission. We then compared the importance of features utilizing different model interpretability analyses, such as SHAP, MDI, and Permutation Importance. Results: The models for ICU with MV, ICU, and IMCU admission identified the following factors overlapping as the most important predictors among the three outcomes: age, race, sex, BMI, diarrhea, diabetes, hypertension, early stages of kidney disease, and pneumonia. It was observed that individuals over 65 years (‘older adults’), males, current smokers, and BMI classified as ‘overweight’ and ‘obese’ were at greater risk of severity of illness. The severity was intensified by the co-occurrence of two interacting features (e.g., diarrhea and diabetes). Conclusions: The top features identified by the models’ interpretability were from the ‘sociodemographic characteristics’, ‘pre-hospital comorbidities’, and ‘medications’ categories. However, ‘pre-hospital comorbidities’ played a vital role in different critical conditions. 
In addition to individual feature importance, the feature interactions also provide crucial information for predicting the most likely outcome of patients’ conditions when urgent treatment plans are needed during the surge of patients during the pandemic.


Introduction
The COVID-19 pandemic has significantly affected global health and economies, overwhelming hospitals and straining limited resources [1]. As of April 2024, there have been over 700 million confirmed cases of COVID-19 and over 7 million deaths worldwide, with Florida alone accounting for almost 1.14% of all confirmed cases and approximately 1.36% of worldwide deaths [2]. The immense strain on healthcare systems, particularly in South Florida, necessitated the development of a predictive model for Intensive Care Unit (ICU) with Mechanical Ventilation (MV) requirement, ICU, or InterMediate Care Unit (IMCU) admission to effectively allocate resources, optimize patient care, and improve outcomes for patients with COVID-19. All patients who required MV were admitted to the ICU; however, not all patients in the ICU necessarily needed MV. For brevity, we refer to ICU admission with MV simply as MV.
The COVID-19 pandemic has emphasized the critical need for optimizing resource allocation in healthcare systems. The demand for ICU beds surged due to COVID-19, with estimates of ICU admission rates ranging from 5% to 30% [3,4]. This increased demand for critical care resources, including ventilators, posed challenges and led to shortages during the peak of the pandemic [5]. Accurately predicting patients at a higher risk of requiring intensive care interventions allows healthcare providers to proactively allocate resources such as ICU beds, ventilators, and specialized staff to the patients who need them the most [6]. This targeted resource allocation ensures that critical care resources are utilized effectively and judiciously, maximizing their impact on patient outcomes.
In addition, predicting IMCU admission allows for proper triage and prevents the underutilization or overburdening of ICU resources. The early identification of patients requiring IMCU admission enables healthcare providers to intervene promptly and provide appropriate care. Patients in the IMCU may require specialized monitoring, non-invasive ventilation, or other interventions to manage their respiratory status or other critical care needs. Not all patients require the care provided in an ICU, but they may still benefit from closer monitoring and specialized interventions available in the IMCU. This improves care coordination, patient flow, and cost-effective resource management. It allows hospitals to anticipate the number of patients requiring intensive care, plan staffing and equipment needs, and coordinate care across different healthcare facilities. By adopting a proactive approach, healthcare systems can better manage an influx of patients, maintain quality care, and ensure that critical care is provided to those in need [7]. Timely, individualized interventions can also help prevent disease progression, reduce complications, and improve patient outcomes [8].
Given the demands for critical decisions in the healthcare system, clinicians can also benefit from AI-driven decision support systems for deciding optimal treatment plans for hospitalized patients with COVID-19 when there is an urgent need for decision making. AI-driven decision support systems can provide insights into disease severity and prognosis, and enable healthcare providers to better communicate disease diagnoses to patients and their families, make informed decisions about end-of-life care, and allocate resources appropriately based on the patient's likelihood of recovery [9]. AI has been successfully deployed in clinical settings to aid in many healthcare decisions [10], such as detecting diseases [11], risk assessments [12], and personalized outcomes [13,14].

An Overview of Machine Learning Studies on COVID-19 Disease Severity
The literature on COVID-19 disease severity has undergone thorough analyses using various predictive methods. Machine Learning (ML) has been used to develop scoring tools based on essential features to measure COVID-19 disease severity [15][16][17] and to predict the disease severity scores of patients presented by a standardized severity scale, such as the National Early Warning Score 2 (NEWS2) [18]. Thus, ML tools can learn to predict outcomes based on a known severity metric or establish a new severity scale that improves the known risk assessment methods by incorporating novel features not included in the standardized scales [17]. Furthermore, past studies have focused on predicting the severity conditions of in-patients, such as MV requirements [18][19][20][21], days spent in ICU [20][21][22][23][24], whether rehospitalization is necessary for recurring health problems [15,25], and forecasting the need for other therapeutic interventions [24].
The studies have used various datasets to train their models; some relied solely on lab biomarkers [24,26], while others utilized medical histories like electronic Health Records (eHRs) and demographic information [21], and a few combined both types of data [19,23,25]. Additionally, some studies incorporated features from medical imaging, eHRs, and laboratory tests [22,27,28]. It is worth noting that the performance of the ML models reported in the studies varied considerably. Some achieved a high performance with an Area Under the Curve (AUC) exceeding 0.90, specifically in studies with smaller cohorts [27,29]. This could potentially be due to overfitting to the specific population tested. For instance, Hong et al. [29] reported an accuracy of over 0.90 for ML prediction when trained on only 63 patients with COVID-19 to predict the severity of illness. Similarly, Liu et al. [27] achieved very high accuracy (AUC > 0.96) on a small patient cohort of approximately 158 individuals. Nevertheless, studies that involve larger cohorts of over 5000 patients [15,21,25] exhibit slightly lower accuracy (AUC < 0.90), but offer generalizability over a larger population.
Furthermore, numerous studies on lab biomarkers have yielded notably robust accuracy. The same performance can also be achieved when the model has been trained on demographics and eHR data, suggesting a viable alternative during a pandemic crisis, when hospital resources are limited and obtaining lab test results becomes challenging. Multiple ML approaches have been employed, including boosting-based classifiers [18,[24][25][26][27], regression methods [15,17,25,30], Support Vector Machine [18,24,27], Random Forest (RF) classifier [18,24,27,30], and Naïve Bayes classifier [17,26]. In comparative studies assessing the accuracy of various ML models, boosting methods, such as CatBoost and XGBoost, have been reported to deliver the highest accuracy, as documented by Noy et al. [18], Liu et al. [27], and Hong et al. [29]. Nevertheless, several other comparative studies [24,30] have indicated that an RF classifier outperforms other ML models. Consequently, our selection of classification methods remains competitive with previous studies.
Additionally, investigations into Deep Learning (DL) neural network-based approaches [21,23] produced similar accuracy results to their ML counterparts and have also been shown to outperform them in certain cases [23]. Furthermore, a hybrid approach combining DL and ML into a single predictive architecture has been reported, in which DL-extracted features from chest CT scan images were used by a downstream CatBoost model, along with lab biomarkers and eHR data, to predict COVID-19 disease severity with high accuracy [22]. This allows for integrating radiological information with other laboratory and clinical reports, further enabling predictive models to make critical decisions [22].
It has been shown that the studies incorporating eHRs and lab tests tend to assign higher importance to lab biomarkers in their feature ranking, as they are direct measures of current health conditions pertaining to disease severity [18,22,24]. However, studies that relied exclusively on eHR data also performed comparably, identifying comorbidities and other demographic variables as top-ranking features [20]. SHAP analysis can provide additional insights by emphasizing individual features with a more pronounced impact on severity [18]. For instance, it has been found that older age, lower platelet counts, and lower lymphocyte levels are closely associated with COVID-19 disease severity [18]. It is worth noting that the exploration of individual class-based SHAP scores for demographic data has been limited and mainly exists in studies focusing on lab biomarkers. Our current study presents SHAP scores for each feature that contributes to predicting MV requirement, ICU, or IMCU admission.

Contributions of the Current Study
Our current study focuses on a large South Florida cohort, a previously explored dataset for COVID-19 disease severity predictive analysis. Previously, Datta et al. [31] utilized the same dataset to find the most critical features underlying mortality risk, utilizing an RF classifier and SHAP-based interpretability. A comparative analysis of the performance of traditional ML, DL, and a fine-tuned Large Language Model (LLM) in our previous study revealed that the RF model has the highest precision and accuracy compared to other traditional ML (e.g., XGBoost and KNN) or DL (e.g., MLP) models [32].
Here, we extended the work to understand the essential features indicative of COVID-19 disease severity for in-patients from South Florida for predicting therapeutic interventions, such as ICU with MV requirement, ICU, and IMCU admission. A similar severity prediction was performed using DL on the same dataset [33]; however, the severity scale was not well-defined. Furthermore, the analysis lacked the ability to interpret the model's decision. The current study explores a better metric for patients, caregivers, and clinicians, which is a direct prediction of therapeutic interventions. Additionally, numerous studies focused solely on comparing ML models and did not examine different interpretability approaches for feature analysis.
Thus, our contributions to this work are two-fold: we determine the severity of COVID-19 disease using three distinct RF classifiers, and we compare the importance of features across the cases susceptible to the conditions. This comparison shows which features are shared across the different severity cases and which vary across the conditions.
To enhance the reliability of the findings, we utilized multiple interpretability methods to better understand the essential features, as their importance varies across many studies. In addition, we analyzed the interactions between SHAP features to understand how combinations of features impact the severity of COVID-19 disease.

Dataset Collection and Subject Information
The Institutional Review Board (IRB) approved the study with an exemption from informed consent and a HIPAA waiver, and the study is exempt from further review. From 14 March 2020 to 16 January 2021, data were obtained from the Memorial Healthcare System (MHS), Hollywood, FL, USA and investigated by the co-authors from the Christine E. Lynn College of Nursing and College of Engineering and Computer Science at Florida Atlantic University.

Study Design Considerations
Figure 1 presents the visual abstract of a complete data science cycle. The input dataset for our analysis consisted of 5594 patients admitted to the hospital with COVID-19-related symptoms. Among the 5371 patients with COVID-19, 4296 (80%) were in the training dataset, while the remaining 1075 (20%) were in the test dataset. Preprocessing steps ensured data quality by eliminating repeated variables and removing the features with over 10% missing values [34] from the patients' eHRs. Among the remaining independent variables, only BMI had missing values (6.5%), which we imputed using the Bayesian Ridge Imputation method [30]. Three separate models were trained utilizing 25 variables, comprising 24 independent variables and one dependent variable (MV, ICU, or IMCU) per model.

Data Classification
The current research is based on binary classification problems with one of three dependent variables (MV, ICU, or IMCU). Each dependent variable was used in a different model and segmented into binary outcomes, classified as 'MV requirement' ('1') vs. 'no MV requirement' ('0'), 'ICU admission' ('1') vs. 'no ICU admission' ('0'), or 'IMCU admission' ('1') vs. 'no IMCU admission' ('0'). The labels were then converted to binary numerical values before training the models. A classification model was trained on the observed values (independent variables, see Table 1), with inputs that were either binary (e.g., sex and diarrhea) or multiclass (e.g., age and BMI), and the model predicted the binary class of the output (dependent variable).

Of the 24 independent variables used in this research, age and BMI were converted from continuous to categorical variables. Age was categorized based on age metrics [35], where 'younger adults' ranged between 20 and 34, 'middle adults' ranged between 35 and 64, and 'older adults' ranged between 65 and 90 years. Similarly, BMI was categorized based on the BMI metrics [36], with 'underweight' defined as below 18.5, 'normal weight' between 18.5 and 24.9, 'overweight' between 25 and 29.9, and 'obese' defined as 30 or greater (see Table 1).
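The binning described above can be sketched in a few lines. This is a minimal illustration, not the authors' preprocessing code: the function names are hypothetical, the treatment of values that fall between the stated ranges (e.g., a BMI of 24.95) is an assumption, and the numeric dummy codes of Table 1 are deliberately not reproduced.

```python
def categorize_age(age_years: float) -> str:
    """Map age in years to the three age groups used in the text.

    Assumption: ages outside 20-90 are rejected rather than clipped.
    """
    if age_years < 20 or age_years > 90:
        raise ValueError("age outside the 20-90 range described in the text")
    if age_years < 35:
        return "younger adults"
    if age_years < 65:
        return "middle adults"
    return "older adults"


def categorize_bmi(bmi: float) -> str:
    """Map a BMI value to the four BMI groups used in the text.

    Assumption: half-open intervals close the small gaps between the
    stated ranges (24.9 to 25 and 29.9 to 30).
    """
    if bmi < 18.5:
        return "underweight"
    if bmi < 25:
        return "normal weight"
    if bmi < 30:
        return "overweight"
    return "obese"


# Example with hypothetical patient values
example = (categorize_age(67), categorize_bmi(31.2))  # ('older adults', 'obese')
```

Using half-open intervals keeps every valid value in exactly one category, which matters when the categories are later dummy coded.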
In addition, an improved alternative for dummy coding, demonstrated in Table 1, was adapted from our previous study on the same dataset [31] in place of 'one-hot encoding' [37]. This approach harnessed the capabilities and efficiency of AI modeling while leveraging interpretability and domain knowledge. Thus, the dummy coding facilitated the effective comprehension of the feature analysis. It is important to acknowledge that the model's optimal performance may be affected by biases introduced in this approach.
However, it simultaneously equips healthcare providers with a well-defined understanding of the health status based on the features analyzed. For example, this research delved into inquiries such as which age group or presence of diabetes had a more pronounced impact, whether patients taking medication for hypertension (ARBs and ACEIs) yielded benefits, whether individuals with diarrhea exhibited a higher predictability of requiring ICU admission, or whether the combination of two variables (interaction effect) exerted a more substantial influence than a single variable.

Correlation Check
Tetrachoric correlation analysis was employed to acquire a more in-depth understanding of the effectiveness and suitability of the 24 independent variables. We chose the tetrachoric correlation due to the categorical nature of our variables, which cannot be analyzed using the Pearson correlation analysis. In addition, the tetrachoric correlation can also handle skewness and outliers in the dataset, capture non-linear correlations between variables, and deal with ordinal data [38].
The analysis revealed that the correlation coefficients among all pairs of variables were generally low (<0.50), with one exception being the positive correlation observed between CKD stage 5 and the dependence on renal dialysis, presenting a correlation coefficient of 0.66. The correlation analysis was conducted solely for exploratory data analysis and was not used as a guiding principle for feature selection. The model's accuracy can potentially be enhanced when two correlated features are part of the same dataset, as discussed by Deb et al. [39].
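For two binary variables, the tetrachoric correlation can be roughly approximated from the 2x2 contingency table with the classical "cos-pi" formula. The sketch below is only an illustration under that assumption: the text does not say which estimator was used, dedicated statistical routines typically estimate the coefficient by maximum likelihood, and the function name and example counts are hypothetical.

```python
import math


def tetrachoric_approx(a: int, b: int, c: int, d: int) -> float:
    """Cos-pi approximation of the tetrachoric correlation.

    Table layout (counts): a = both present, b and c = discordant cells,
    d = both absent. This is an approximation, not a maximum-likelihood fit.
    """
    if b * c == 0:
        # Assumption: no discordant pairs in one direction is treated as
        # perfect positive association.
        return 1.0
    if a * d == 0:
        return -1.0
    odds_ratio = (a * d) / (b * c)
    return math.cos(math.pi / (1.0 + math.sqrt(odds_ratio)))


# Hypothetical counts: strong positive association between two binary features
r_example = tetrachoric_approx(40, 10, 10, 40)  # cos(pi / 5), about 0.81
```

An independent table (all four cells equal) gives an odds ratio of 1 and therefore a coefficient of essentially zero, which is a quick sanity check on the formula.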

Resampling Data
The data were found to be imbalanced based on the information above. Oversampling was therefore used to balance the dataset. Rather than duplicating existing examples, new minority-class examples were synthesized using the Synthetic Minority Over-sampling TechniquE (SMOTE) [41]. SMOTE was applied only to the training dataset; the test dataset was left unmodified to prevent data leakage [39].
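The core idea of SMOTE, interpolating between a minority-class sample and one of its nearest minority-class neighbours, can be sketched as follows. This is a simplified standard-library illustration with hypothetical names and toy data, not the implementation used in the study; in particular, plain interpolation yields non-integer values, so categorical or dummy-coded features would need a categorical-aware variant or rounding, which is not shown here.

```python
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Generate n_new synthetic samples from a list of minority-class rows.

    Each synthetic row lies on the line segment between a randomly chosen
    minority row and one of its k nearest minority neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority rows by squared Euclidean distance to base
        neighbours = sorted(
            (row for row in minority if row is not base),
            key=lambda row: sum((x - y) ** 2 for x, y in zip(base, row)),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([x + gap * (y - x) for x, y in zip(base, neighbour)])
    return synthetic


# Toy minority class: the four corners of the unit square (hypothetical data)
toy_minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_synthetic = smote_sketch(toy_minority, n_new=6, k=2)
```

Calling the function only on the training rows, and never on the held-out test rows, mirrors the leakage precaution described above.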

Cohort Description
The table below describes the categorical variables and corresponding dummy coding values.
We present the rest of the results in the following order: MV requirement analysis is reported first, followed by ICU admission and IMCU admission. Section 3 includes statistical analysis, model prediction, feature interpretability, and interactions.

Statistical Analysis
Three separate chi-square tests were used to estimate the predictive value of the 24 independent variables in identifying individuals likely to require MV, or be admitted to ICU or IMCU [42].
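For a single binary feature and a binary outcome, the chi-square statistic and the odds ratio both follow directly from the 2x2 table of counts. The sketch below shows that arithmetic under stated assumptions: the counts are hypothetical (not taken from Tables 2-4), no continuity correction is applied, and the function names are illustrative.

```python
def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Odds ratio for a 2x2 table.

    a = feature present, outcome present   b = feature present, outcome absent
    c = feature absent,  outcome present   d = feature absent,  outcome absent
    """
    return (a * d) / (b * c)


def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square statistic (1 degree of freedom, no continuity correction)."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Hypothetical counts for one feature versus one outcome
or_example = odds_ratio(20, 80, 10, 190)        # 4.75
chi2_example = chi_square_2x2(20, 80, 10, 190)  # about 16.67
# With 1 degree of freedom, a statistic above 3.841 corresponds to p < 0.05
significant = chi2_example > 3.841
```

An odds ratio above 1 indicates that the outcome is more likely when the feature is present, which is how values such as OR = 7.2 in the following paragraphs are read.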
As shown in Table 2, 12 of the 24 variables were statistically significant in predicting the likelihood of requiring MV. These variables included age, sex, diabetes, hypertension, CKD stages 1-4, CKD stage 5, heart failure, coronary artery disease, liver disease, pneumonia, diarrhea, and dependence on renal dialysis. The highest risk factors associated with MV were diarrhea (OR = 7.2), CKD stages 1-4 (OR = 2.89), and hypertension (OR = 2.75).
As shown in Table 3, among the 24 variables, 15 were statistically significant in predicting the likelihood of being admitted to the ICU. These variables included age, BMI, sex, race, diabetes, hypertension, Chronic Obstructive Pulmonary Disease (COPD), CKD stages 1-4, CKD stage 5, heart failure, coronary artery disease, liver disease, pneumonia, diarrhea, and dependence on renal dialysis. The variables associated with the highest risk were diarrhea (OR = 8.86) and age (OR = 3.43), with older adults being 3.43-times more likely to be admitted to the ICU than younger adults.
As shown in Table 4, 15 of the 24 variables were statistically significant in predicting the likelihood of being admitted to the IMCU. These variables included age, sex, smoking status, diabetes, hypertension, COPD, CKD stages 1-4, CKD stage 5, heart failure, cardiac arrhythmias, coronary artery disease, pneumonia, ARBs, ACEIs, and diarrhea. The highest risk factors were diarrhea (OR = 2.86) and age (OR = 2.74), with older adults being 2.74-times more likely to be admitted to the IMCU than younger adults.
Using traditional statistical approaches, multiple options exist for selecting the best key features. Among the most popular are forward and backward stepwise selection. We chose the backward Wald stepwise binary logistic regression method due to its conservativeness and low likelihood of introducing false positives to the model [43][44][45].
Several problems have been noted with this traditional approach. The biggest issue is that these approaches can overfit the model specifically for that sample, and therefore, the selected key features can vary from sample to sample [46,47]. To compensate for this potential problem, a 10-fold cross-validation approach was used with Wald's backward binary logistic regression. This validation routine randomly parsed the data into 9 separate training datasets of 573 patients and a 10th testing dataset with 574 patients. Mean Squared Errors (MSEs) and R² were compared in each fold. Lastly, a final imputed dataset was constructed from the features identified in the training dataset and compared with the values in the test dataset to obtain the final variable selection and R². Tables 5-7 report the three separate Wald's backward binary logistic regression results for MV, ICU, and IMCU.
As seen in Table 5, of the initial 24 features used to predict the likelihood of requiring MV, the following 14 were retained by the model: age, BMI, sex, race, diabetes, hypertension, CKD stages 1-4, cardiac arrhythmias, cerebrovascular disease, pneumonia, ARBs, ACEIs, diarrhea, and dependence on renal dialysis. These features were statistically significant in predicting MV requirement and accounted for approximately 20% of the variability in the model [χ²(8) = 373.96, p < 0.001, Nagelkerke R² = 0.20] with a 92.5% overall correct classification rate. Of these retained variables, diarrhea had the largest odds ratio in the multivariate binary logistic regression model; those diagnosed with diarrhea were 6.31-times more likely to require MV than those who were not.
As seen in Table 6, of the initial 24 features used to predict patients admitted to the ICU, the following 15 were retained by the model: age, BMI, sex, race, diabetes, hypertension, asthma, CKD stages 1-4, heart failure, cardiac arrhythmias, cerebrovascular disease, pneumonia, ARBs, ACEIs, and diarrhea. These features were statistically significant in predicting ICU admissions and accounted for approximately 26% of the variability in the model [χ²(8) = 374.96, p < 0.001, Nagelkerke R² = 0.26] with an 89.4% overall correct classification rate. Of these retained variables, diarrhea had the highest odds ratio in the multivariate binary logistic regression model; those diagnosed with diarrhea were 8.52-times more likely to be admitted to the ICU than those who were not.
Lastly, Table 7 presents the results for predicting the likelihood of being admitted to the IMCU; of the initial 24 features, the following 14 were retained by the model: age, BMI, sex, race, ethnicity, diabetes, hypertension, COPD, CKD stages 1-4, heart failure, cardiac arrhythmias, pneumonia, ACEIs, and diarrhea. These features were statistically significant in predicting IMCU admissions and accounted for approximately 11% of the variability in the model [χ²(15) = 374.96, p < 0.001, Nagelkerke R² = 0.11] with an 81.5% overall correct classification rate. Of these retained variables, diarrhea had the largest odds ratio in the multivariate binary logistic regression model; those diagnosed with diarrhea were 2.56-times more likely to be admitted to the IMCU than those who were not.
To further assess the potential of invariances across gender, age, and ethnicity, sets of backward binary logistic regressions were conducted and cross-validated for each group. Tables 8-10 report all the variables we identified as important for predicting MV requirement and ICU and IMCU admissions across groups. The classification accuracy for MV by sex ranged from 91.5% for males to 93.6% for females. There were more significant differences in model accuracy between the age categories, with older adults having the lowest accuracy (89.7%) compared to younger adults (97.3%) and 'middle' adults (94.1%). The accuracy for the non-Hispanic and Hispanic groups was approximately equal for MV, with accuracy values of 92.4% and 92.7%, respectively.
The highest overall accuracy across all groups was observed for predicting ICU admissions. All groups reported an accuracy value above 90% for predicting ICU admission, except for the ethnicity groups, which approached 90% accuracy (non-Hispanic = 89.6% and Hispanic = 89.8%).
When assessing IMCU admissions, the overall accuracy decreased, ranging from a low of 76.4% for older adults to a high of 90.5% for younger adults. As can be seen, the features important for predicting MV, ICU, and IMCU outcomes vary somewhat among the demographic groups of sex, age, and ethnicity. This suggests the necessity of retaining demographic variables in all future analyses.

Model Performance
The models' performances on the imbalanced dataset were assessed with four metrics: precision, recall, F1-score (weighted), and AUC. Precision measures how many of the positive predictions made by the classifier are correct, i.e., 'of all the predicted severe cases, how many are actually severe?' Recall quantifies how well the classifier identifies all positive cases, i.e., 'what proportion of actual severe cases did the model correctly predict as severe?' The F1 score balances precision and recall, considering both false positives and false negatives. It is computed as the harmonic mean of precision and recall, where the best value is 1.0 and the worst value is 0.0 [48]. Upon training the RF classifier, we obtained F1-scores (Figure 2a) of 89% (precision: 88% and recall: 90%), 87% (precision: 87% and recall: 88%), and 75% (precision: 73% and recall: 78%), respectively, for MV, ICU, and IMCU.
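The relationship between the three quantities can be checked with a short calculation. The sketch below uses hypothetical function names and a hypothetical set of confusion counts; the second part simply takes the harmonic mean of the precision and recall values reported above. Because the reported scores are weighted averages over classes, the harmonic mean of the averaged precision and recall need not equal the weighted F1 exactly, so the agreement after rounding should be read as a consistency check only.

```python
def precision_recall_f1(tp: int, fp: int, fn: int):
    """Precision, recall, and F1 from true positives, false positives, false negatives."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def harmonic_mean(precision: float, recall: float) -> float:
    """F1 expressed directly as the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical confusion counts for one class
p, r, f1 = precision_recall_f1(tp=8, fp=2, fn=2)  # all three are 0.8

# Consistency check against the precision/recall pairs reported in the text
f1_mv = round(harmonic_mean(0.88, 0.90), 2)    # 0.89
f1_icu = round(harmonic_mean(0.87, 0.88), 2)   # 0.87
f1_imcu = round(harmonic_mean(0.73, 0.78), 2)  # 0.75
```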
Diagnostics 2024, 14, x FOR PEER REVIEW 16 of 30
Model performance was relatively poor in predicting IMCU admissions because, in this case, the model displayed an inadequate performance in the majority class, which resulted in a poor F1 score. The above results are reported after the models' hyperparameters were tuned using grid search with K-fold (10-fold) cross-validation [49,50].
Confusion matrices are shown in Figure 2b. The models' performances were also evaluated using the AUC of the Receiver Operating Characteristic (ROC) analysis. A perfect model would provide an AUC of 1, and an uninformed model would provide an AUC of 0.5. The ROC curve plots the TP rate versus the FP rate at different threshold values and shows how well the model distinguishes between 'severity' and 'no-severity' cases.
The reported ROC-AUC (Figure 3a) values for the predictions of MV, ICU, and IMCU were 0.73 (95% CI: 0.66-0.79), 0.80 (95% CI: 0.75-0.85), and 0.64 (95% CI: 0.60-0.69), respectively. In the current study, AUC performance for MV and IMCU was poorer than for ICU.
The ROC curves for the test data (Figure 3a) show that the prediction of ICU admission achieves the best performance. It should be noted that the FP rate is 1 minus the specificity (the TN rate), which means that the closer the FP rate is to 0, the higher the specificity. Therefore, the optimal operating point lies toward the top-left corner of the ROC curve, where both specificity and sensitivity are high.
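The two ideas in this passage, one point on the ROC curve per threshold and the AUC as a summary of the whole curve, can be illustrated with a small standard-library sketch. The function names and the four-patient toy data are hypothetical; the AUC is computed here through its rank interpretation (the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case), which is equivalent to the area under the ROC curve but is not necessarily how the study's software computed it.

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def tpr_fpr(y_true, scores, threshold):
    """One ROC point: (sensitivity, 1 - specificity) at a given score threshold."""
    tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(y_true, scores) if y == 1 and s < threshold)
    fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= threshold)
    tn = sum(1 for y, s in zip(y_true, scores) if y == 0 and s < threshold)
    return tp / (tp + fn), fp / (fp + tn)


# Hypothetical labels (1 = severe) and predicted probabilities
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]

auc_example = roc_auc(toy_labels, toy_scores)      # 0.75
tpr, fpr = tpr_fpr(toy_labels, toy_scores, 0.5)    # (0.5, 0.0)
specificity = 1.0 - fpr                            # 1.0
```

Lowering the threshold moves the point up and to the right on the curve, trading specificity for sensitivity.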
In the test data, we observed an inadequate model performance in predicting minority classes. The imbalanced nature of the dataset caused this discrepancy. The model is biased toward predicting the majority class; for instance, the accuracy of predicting 'ICU' is lower than that of predicting 'no ICU'.

Model Interpretability
Three interpretability (post hoc) methods were used to explain the models' performance. These were MDI (Mean Decrease in Impurity), Permutation Importance, and SHAP, chosen because they were the most common interpretability techniques in our background literature review [15,22,23,26,31].
The first interpretability method was MDI, based on the Gini impurity values of the RF, where splitting rules maximize impurity reduction [51][52][53][54]. Impurity indicates how well a node can classify a dataset. The Gini index determines the impurity by calculating the probability of misclassifying a randomly selected element from the dataset [55]. The term 'impurity' describes how homogeneous or mixed the classes are in a node, where a node with 'zero' impurity has only one class, and a node with maximum impurity has an equal mix of all classes [52]. MDI measures how much each feature reduces the impurity of the nodes where it is used to split the data, and it calculates feature importance as the sum of impurity decreases across all the splits that include the feature, averaging over all the trees in the ensemble [52]. The higher the MDI (Figures 4a, 5a and 6a), the more influential the feature is for the model. A split with a significant decrease in impurity is important for the RF classifier; consequently, the feature and the corresponding split are essential for the model's decision.
[Figure caption, MV model (partially recovered): in the 'bee swarm plot', red indicates higher feature values and blue indicates lower values; each dot represents a patient and shows which feature values correspond to which SHAP values (X-axis). (c) The top 5 features in ranking order for predicting MV use are diarrhea, age, diabetes, BMI, and hypertension. (d) The 'Waterfall plot' indicates that diarrhea results in a composition ratio of >25% (X-axis, top); the top 13 features accumulate 90% of the model's cumulative ratio (X-axis, bottom).]
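The impurity calculation behind MDI can be made concrete for a single split. The sketch below uses hypothetical function names and toy labels; it computes the Gini impurity of a node and the impurity decrease obtained by splitting it. A full MDI score additionally weights each decrease by the share of samples reaching the node, sums over every split that uses the feature, and averages over all trees, which is omitted here.

```python
def gini(labels):
    """Gini impurity of a node: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def impurity_decrease(parent, left, right):
    """Impurity reduction achieved by splitting parent into left and right children."""
    n = len(parent)
    weighted_children = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(parent) - weighted_children


# Toy node with an equal mix of two classes (maximum impurity for two classes)
toy_parent = [0, 0, 1, 1]
perfect_split = impurity_decrease(toy_parent, [0, 0], [1, 1])  # 0.5
useless_split = impurity_decrease(toy_parent, [0, 1], [0, 1])  # 0.0
```

A feature whose splits look like the first case accumulates a high MDI, while one whose splits look like the second contributes nothing.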
The second method was model-agnostic interpretability based on permutation-based analysis [56,57]. In this method, the values of one input feature in the dataset are shuffled, and the change in the model's prediction accuracy is recorded for the shuffled feature. The same process is repeated for each of the other features, keeping the rest of the dataset unshuffled. The features are then ranked by the resulting change in model performance [56,57]. This permutation-based feature ranking (Figures 4b, 5b and 6b) identifies the features most important to the model's performance.
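The shuffling procedure can be sketched in a few lines. This is a minimal illustration under stated assumptions: the `model` function is a hypothetical stand-in for a trained classifier (it uses only the first feature), the data are made up, and the number of repeats and the seed are arbitrary choices.

```python
import random


def accuracy(model, X, y):
    """Fraction of rows for which the model's prediction matches the label."""
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)


def permutation_importance(model, X, y, n_repeats=20, seed=0):
    """Mean drop in accuracy when one feature column is shuffled at a time."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the labels
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances


def model(row):
    """Hypothetical stand-in classifier: predicts from feature 0, ignores feature 1."""
    return int(row[0] == 1)


# Made-up dataset with two binary features; the label equals feature 0.
X = [[i % 2, (i // 2) % 2] for i in range(40)]
y = [row[0] for row in X]

imp = permutation_importance(model, X, y)
print(imp)
```

Shuffling the feature the model relies on lowers accuracy, whereas shuffling the ignored feature leaves accuracy unchanged, so the first feature ranks higher.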
The third interpretability method, also model-agnostic, is based on cooperative game theory, which computes Shapley values for each player in a multiplayer game to quantify that player's contribution to the outcome [58][59][60][61]. Shapley values are computed for each feature to identify its contribution to changing the model's decision. The SHAP value for a particular feature is calculated as the change in prediction obtained when the feature value is randomly sampled from a distribution, relative to the average prediction across all data instances (see Figures 4c,d, 5c,d and 6c,d).
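The game-theoretic idea can be illustrated with an exact Shapley computation on a toy game. This is a sketch only: the value function `value` and its numbers are hypothetical, and practical SHAP implementations rely on faster approximations rather than enumerating every coalition.

```python
from itertools import combinations
from math import factorial


def shapley_values(value, n_players):
    """Exact Shapley values: weighted average marginal contribution of each player."""
    players = range(n_players)
    phi = [0.0] * n_players
    for i in players:
        others = [p for p in players if p != i]
        for size in range(len(others) + 1):
            # Probability weight of a coalition of this size preceding player i.
            weight = (factorial(size) * factorial(n_players - size - 1)
                      / factorial(n_players))
            for coalition in combinations(others, size):
                s = frozenset(coalition)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def value(coalition):
    """Hypothetical model output when only the features in `coalition` are known."""
    v = 0.0
    if 0 in coalition:
        v += 0.2
    if 1 in coalition:
        v += 0.1
    if 0 in coalition and 1 in coalition:
        v += 0.1  # made-up interaction between features 0 and 1
    return v  # feature 2 never changes the output


phi = shapley_values(value, 3)
print(phi)
```

In this toy game the interaction term is shared equally between the two interacting features, the unused feature receives zero, and the values sum to the difference between the full and empty coalitions.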
SHAP and MDI perform similarly when identifying important features [51]; however, SHAP considers the local and global distributions of features, whereas MDI only provides global importance. SHAP can overestimate irrelevant features over relevant features under binary networks [62] and can incorrectly identify important features in out-of-distribution scenarios [63]. The permutation-based approach, however, is unique in that it can capture other important features undetected by SHAP [64].
The 'bee swarm plot' (Figures 4c, 5c and 6c) depicts the SHAP values of each feature distributed across the whole population, where the sub-categories of each feature are presented as a 'heatmap' as per the dummy coding (see Table 1). 'Cool color' (blue) represents a sub-category coded by a lower number (e.g., '0' denotes 'no-diabetes'), and 'warm color' (red) represents a higher number (e.g., '1' denotes 'diabetes'). On the X-axis, the '0' in the middle of the SHAP scale indicates a 'Neutral Point': a value toward the right supports a positive decision (predicting 'MV', 'ICU', or 'IMCU'), whereas a value toward the left supports a negative decision (predicting 'no-MV', 'no-ICU', or 'no-IMCU'). Features are arranged along the Y-axis in ranking order of importance, determined by their absolute Shapley values, with the more important features placed at the top.
SHAP analysis can also be comprehended through the 'waterfall plot' (Figures 4d, 5d and 6d), where each feature's contribution can be analyzed to understand model explainability. The features on the Y-axis are ranked based on the compositional score (X-axis, top) estimated as the mean of SHAP values across the population for the presented feature. The cumulative score (X-axis, bottom) shows the additive values of features on the model that explain the model's interpretation.
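The compositional and cumulative scores can be sketched as follows. Two assumptions are made here that the text does not state explicitly: the per-feature score is taken as the mean absolute SHAP value, and it is normalized by the total over all features to give a ratio. The SHAP matrix and the feature names are made up for illustration.

```python
def composition_table(shap_rows, feature_names):
    """Rank features by mean |SHAP| and return (name, ratio, cumulative ratio) rows."""
    n = len(shap_rows)
    mean_abs = {
        name: sum(abs(row[j]) for row in shap_rows) / n
        for j, name in enumerate(feature_names)
    }
    total = sum(mean_abs.values())
    ranked = sorted(mean_abs.items(), key=lambda item: item[1], reverse=True)
    table, cumulative = [], 0.0
    for name, score in ranked:
        ratio = score / total
        cumulative += ratio
        table.append((name, ratio, cumulative))
    return table


def features_to_reach(table, threshold=0.90):
    """Number of top-ranked features needed for the cumulative ratio to reach threshold."""
    for k, (_, _, cumulative) in enumerate(table, start=1):
        if cumulative >= threshold:
            return k
    return len(table)


# Hypothetical SHAP values: one row per patient, one column per feature.
feature_names = ["f0", "f1", "f2", "f3"]
shap_rows = [
    [0.4, -0.2, 0.1, 0.0],
    [-0.6, 0.2, -0.1, 0.1],
]

table = composition_table(shap_rows, feature_names)
n_top = features_to_reach(table)
for name, ratio, cumulative in table:
    print(f"{name}: composition={ratio:.3f} cumulative={cumulative:.3f}")
print("features needed for 90%:", n_top)
```

Read this way, a statement such as "the top 13 features accumulate 90%" corresponds to the smallest number of ranked features whose cumulative ratio reaches 0.90.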
In Figure 4a, the top five features that determine the model's prediction for MV are diarrhea, age, BMI, race, and diabetes, in ranking order, using MDI. Permutation Importance (Figure 4b) indicates a similar top five; however, sex (ranked 5th) is more important than race (ranked 7th). The SHAP analysis (Figure 4c,d) agrees with the other two methods on the top four features (diarrhea, age, diabetes, and BMI) and ranks hypertension 5th. Despite these slight differences, all interpretability methods tend to agree on the most important features contributing to the model's decision making.
Figure 4d shows that the top 13 features contribute 90% to the model's interpretation. Among them, five are from sociodemographic characteristics (age, race, sex, smoking status, and ethnicity), six are from comorbidities (diarrhea, diabetes, BMI, hypertension, pneumonia, and CKD stages 1-4), and two are from the medication category (ACEIs and ARBs). Thus, the findings suggest that the trained models can use the information from each feature category to make a successful prediction.
The features were analyzed similarly for ICU, where MDI (Figure 5a), Permutation Importance (Figure 5b), and SHAP (Figure 5c,d) show that feature importance is consistent across both models, ICU and MV. SHAP placed sex among the top 5 features instead of race, which was indicated by MDI and Permutation Importance. Figure 5d indicates that the top 12 features contributed 90% to the model's interpretation. Among them, four were from sociodemographic characteristics (age, sex, race, and smoking status), six were from comorbidities (diarrhea, diabetes, BMI, hypertension, pneumonia, and CKD stages 1-4), and two were from the medication category (ARBs and ACEIs). We found one less sociodemographic feature (ethnicity) than in MV contributing to the 90% decision.
In the IMCU, MDI (Figure 6a) and Permutation Importance (Figure 6b) demonstrated that BMI, age, diarrhea, and race are the top 4 most important features. SHAP (Figure 6c,d) agreed with the other two methods only on the top 3 features: in SHAP, BMI is the most important feature, followed by age, diarrhea, hypertension, and sex.
Figure 6d shows that the top 13 features contribute 90% to the model's interpretation. Among them, four were from sociodemographic characteristics (age, sex, race, and smoking status), eight were from comorbidities (BMI, diarrhea, hypertension, diabetes, pneumonia, CKD stages 1-4, COPD, and heart failure), and one was from the medication category (ACEIs). Compared to MV, we found that one sociodemographic (ethnicity) and one medication (ARBs) feature were replaced by COPD and heart failure. Similarly, compared to ICU, COPD and heart failure were more important features than ARBs, contributing to the 90% interpretation.
It is worth noting that the results show similar features between MV and ICU, compared to IMCU, as patients requiring MV were also admitted to the ICU.

SHAP Dependence Plot
A 'SHAP dependence plot' demonstrates how the model output differs depending on the interaction of two features [16,65]. By examining the scatterplot, we can analyze the pattern and trend of the relationship between a variable and the model's output while considering the other variable [66,67]. If there is an interaction effect, it will be evident through distinct patterns along the Y-axis [16,66].
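One simple numerical counterpart to reading such a plot is to average the SHAP value of one feature within each combination of the two categorical features. The sketch below assumes made-up binary codes and made-up SHAP values; the helper name `mean_shap_by_groups` is hypothetical and is not part of any SHAP library.

```python
from collections import defaultdict


def mean_shap_by_groups(primary, secondary, shap_primary):
    """Mean SHAP value of the primary feature for each (primary, secondary) pair."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for a, b, s in zip(primary, secondary, shap_primary):
        sums[(a, b)] += s
        counts[(a, b)] += 1
    return {key: sums[key] / counts[key] for key in sums}


# Hypothetical dummy-coded features (1 = present) and SHAP values for diarrhea.
diarrhea = [0, 0, 1, 1, 0, 0, 1, 1]
diabetes = [0, 0, 0, 0, 1, 1, 1, 1]
shap_diarrhea = [-0.1, -0.1, 0.1, 0.1, -0.1, -0.1, 0.3, 0.3]

table = mean_shap_by_groups(diarrhea, diabetes, shap_diarrhea)

# Effect of diarrhea on the model output, without and with diabetes.
gap_without_diabetes = table[(1, 0)] - table[(0, 0)]
gap_with_diabetes = table[(1, 1)] - table[(0, 1)]
print(gap_without_diabetes, gap_with_diabetes)
```

When the two gaps differ, the contribution of the first feature depends on the second, which is the pattern a dependence plot shows as a vertical separation of the colored dots.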
Figure 7 shows the sub-groups of one categorical variable on the X-axis and the sub-groups of the other categorical variable on the Y-axis (right). The Y-axis (left) represents the SHAP value for the interaction between the two categorical variables associated with each patient, represented as dots.

Discussion
The success of Random Forest classifiers in previous studies [23,24,27,30-32] led us to select the same approach for our current study, and the performance of our models after training is comparable with previous studies. This suggests that ML can be a valuable tool for predicting the severity of COVID-19 disease for rapid therapeutic interventions, like MV requirement, ICU, or IMCU admission. Under high-demand conditions, healthcare systems can rely on an alternate, fast intelligence system for therapeutic decision making under critical care. Our study contributes toward understanding COVID-19 disease severity in a large sample from South Florida and identifies essential features utilizing interpretability approaches in AI/ML techniques. This study identified the primary aspects of eHR data that play a crucial role in COVID-19 disease severity.
We extended the feature analysis by comparing multiple interpretability techniques. Our analysis explored how each feature impacts the critical condition and how the interplay between features contributes to the severity of COVID-19 disease.
In this study, we trained three RF classifiers, one for each of the three conditions: (1) MV requirement, (2) ICU, and (3) IMCU admission. We trained the models using 24 independent variables containing information on patients' sociodemographic characteristics, comorbidities, and medications. The F1-scores (weighted) across our three prediction models were 0.89 for MV, 0.87 for ICU, and 0.75 for IMCU, with an AUC of 0.73 for MV, 0.80 for ICU, and 0.64 for IMCU. This performance was comparable to previous findings [15,19,21,25]. It is worth noting that the models' accuracy was higher for MV and ICU compared to IMCU.
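Because the reported F1-scores are weighted by class size, it helps to see how the weighting behaves on imbalanced data. The sketch below computes a support-weighted F1 and per-class recall from scratch; the labels are a made-up example (nine majority cases, one minority case), not the study's data.

```python
from collections import Counter


def class_counts(y_true, y_pred, cls):
    """True positives, false positives, and false negatives for one class."""
    tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
    fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
    fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
    return tp, fp, fn


def recall(y_true, y_pred, cls):
    """Share of the true members of a class that the model detects."""
    tp, _, fn = class_counts(y_true, y_pred, cls)
    return tp / (tp + fn) if tp + fn else 0.0


def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with weights proportional to each class's support."""
    support = Counter(y_true)
    score = 0.0
    for cls, n_cls in support.items():
        tp, fp, fn = class_counts(y_true, y_pred, cls)
        precision = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * rec / (precision + rec) if precision + rec else 0.0
        score += (n_cls / len(y_true)) * f1
    return score


# Hypothetical imbalanced test set: 1 = severe (minority), 0 = not severe.
y_true = [0] * 9 + [1]
y_pred = [0] * 10  # a classifier that always predicts the majority class

score = weighted_f1(y_true, y_pred)
minority_recall = recall(y_true, y_pred, 1)
print(round(score, 3), minority_recall)
```

In this toy case the weighted F1 stays high even though every minority case is missed, which is why per-class errors are worth examining alongside the weighted score.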
There is evidence of type I errors in predicting IMCU, possibly because the most important features (rank-wise) lead the model to flag patients as IMCU cases (false positives) when those features are in fact better predictors for MV or ICU. In addition, IMCU patients may comprise two different populations: one admitted directly to the IMCU during their initial hospitalization, and another stepped down to the IMCU after initial admission to the ICU. Thus, the patients in the IMCU have a large diversity of risk factors, and it might be hard for the classifier to learn a better decision boundary. There is also evidence of type II errors in predicting IMCU admissions against 'other COVID-19 patients' who have not experienced severe events (MV, ICU, and IMCU). In those cases, presumably, the feature distribution of the 'other COVID-19 patients' class ('no-MV', 'no-ICU', and 'no-IMCU') may not be distinct from IMCU, as the illness of patients admitted to the IMCU is less severe than that of patients in the MV or ICU groups. Thus, the model failed to distinguish features between the two classes (IMCU vs. 'no-MV', 'no-ICU', and 'no-IMCU', taken together), resulting in more false negatives.
We utilized SHAP analysis to better understand the model's decision and compared it with other interpretability methods: MDI and Permutation Importance. The SHAP analysis showed that the top five features (diarrhea, age, diabetes, BMI, and hypertension) for predicting MV were similar across other interpretive methods but in different ranking orders (MDI: diarrhea, age, BMI, race, diabetes; Permutation Importance: diarrhea, diabetes, age, BMI, and sex).
In the ICU, the SHAP analysis showed the top five features (diarrhea, age, diabetes, BMI, and sex), which were almost consistent with the other two interpretability methods (MDI: diarrhea, age, BMI, race, and diabetes; Permutation Importance: diarrhea, diabetes, age, BMI, and race). This shows that the other two methods placed race in the top five features instead of sex.
Similarly, in the IMCU, the SHAP analysis showed top five features (BMI, age, diarrhea, hypertension, and sex) that match the other two interpretability methods only for the top three (MDI: BMI, age, diarrhea, race, and smoking status; Permutation Importance: BMI, diarrhea, age, race, and pneumonia). As discussed, IMCU is less severe than the other two disease severities (MV and ICU); hence, the model interprets the features differently than for the other two. These findings were consistent with previous research on the clinical attributes and the prevalence of coexisting medical conditions in COVID-19 patients [15,17,20,25]. Few studies compare feature importance across severe conditions; notably, only one such study compared feature importance across different age groups, showing that the risk factors associated with COVID-19 vary across ages [68].
This research investigated how compromised health conditions contribute to worsening the severity of COVID-19 disease. These include diarrhea, diabetes, hypertension, pneumonia, and CKD stages 1-4. This research revealed that patients on medications such as ARBs and ACEIs, commonly used to manage high blood pressure and heart failure, had a decreased likelihood of COVID-19 disease severity. Other researchers have also reported the protective nature of these medications (ARBs and ACEIs) for hypertension [69-72], in agreement with our findings.
Finally, we also found some crucial interactions across features. Patients with pneumonia and diabetes, as well as diabetes and diarrhea, are more likely to require MV. In the ICU, diarrhea affects middle-aged adults the most and interacts strongly with diabetes. In the IMCU, the interaction between diarrhea and pneumonia, as well as between hypertension and middle-aged or older adults, increases the risk of severity.
Some features reported as significant in the chi-squared test and retained by the binary logistic regression models (see the statistical results) were also among the top features in the three interpretability approaches. For MV, these common features include age, sex, diarrhea, diabetes, hypertension, CKD stages 1-4, and pneumonia. Race and BMI stand out as top features across interpretability approaches, but they were not significant in the chi-squared test.
In the ICU, age, sex, race, BMI, diarrhea, diabetes, hypertension, and pneumonia were important features in all statistical (chi-square and binary logistic regression) and interpretability methods. Lastly, in the IMCU, age, sex, diarrhea, diabetes, hypertension, CKD stages 1-4, and pneumonia were important features. BMI, race, and smoking status were important across all three interpretability approaches, but not in the statistical methods.
It is important to note that heart-related comorbidities significant in the statistical methods for all three severity conditions were not among the top 10 important features across the three interpretability techniques.
Not many studies have analyzed features using both traditional statistics and ML interpretability techniques, as explored in the current study. Wu et al. [73] used seven different interpretability techniques to identify important features from lab biomarker data for predicting COVID-19 disease severity. However, several of the biomarkers analyzed in that study are not commonly administered in regular checkups nor frequently tested by clinicians. Thus, our study uniquely contributes to identifying important features from easily accessible eHR data across different COVID-19 severities of disease.

Limitations
We had no opportunity to intervene in the clinicians' data collection process in the current study. Re-designing the study prospectively may help eliminate 'patient selection', 'patient interaction', and 'clinician reporting' biases [74]. Data were collected when there was a surge of in-patients along with a high rate of death and severity of illness due to limited access to COVID-19 treatments (medications and vaccinations), leading to instances of incomplete, missing, or inaccurate clinical reporting.
Some variables, such as 'sputum at admission' and 'fever', had to be removed from the dataset because they were unavailable for many patients. This information could have improved our models' predictions if reported properly. Furthermore, the sociodemographic distribution of COVID-19 patients in South Florida could influence the model's performance [74][75][76], leading to a better performance in certain sub-groups (e.g., sex, age, and ethnicity) with a greater population density.
Similarly, the imbalanced test dataset posed a challenge for the model in predicting the minority class (severe cases) accurately. It is important to reduce type II errors where the model fails to detect the minority class, as those are crucial for patient recovery from the severity of disease. Thus, clinicians should not solely rely on the model's predictions, but instead use current clinical trends and judgments.
Models can still perform poorly under outliers and unseen cases due to biased estimations and learning about spuriously correlated features, despite the great strides made to improve the models' performance [74][75][76][77].
Thus, to aid clinicians in stratifying the most important features, we kept all independent variables and avoided feature selection based on statistical analysis, which could lead to underperformance.

Future Direction
In our future work, we will explore the case-specific underperformance of our models, as we found that IMCU prediction results in lower accuracy compared to the other two models. Data augmentation and feature engineering [10,75] based on the significant features reported in the statistics can also help optimize the model's performance. Specifically, feature selection methods, such as Uniform Manifold Approximation and Projection (UMAP), have been shown to significantly improve a classifier's performance when predicting COVID-19 severity [78]. Hence, we will implement UMAP feature selection in our future works for a more robust performance.
Second, we will explore a multiclass classifier that simultaneously predicts all three severity cases rather than training separate models for each condition. A comprehensive understanding of disease progression can enhance the transparent and trustworthy knowledge of end users and aid them in making better-informed decisions with the model.
In addition to multiclass classifications, we will utilize SHAP scores to cluster different patient groups, as Khadem et al. [79] reported, identifying the most susceptible cluster based on mortality counts. Cluster analysis may be necessary in future studies to identify the clusters representing a higher severity risk of COVID-19.

Conclusions
Developing an AI-driven decision support system for predicting the critical clinical events of in-patients with COVID-19 disease not only addresses the immediate needs of the pandemic, but also advances the field of AI/ML in healthcare. By leveraging cutting-edge technologies and algorithms, such as ML, researchers and healthcare professionals can unlock the potential of data-driven insights to transform patient care. The application of AI/ML in healthcare extends beyond the COVID-19 disease, holding promise for improving diagnosis, treatment selection, disease surveillance, and patient outcomes across various medical specialties and healthcare settings. By pushing the boundaries of AI/ML in healthcare, we can foster innovation, enhance decision making, and ultimately improve health outcomes for individuals and populations. This knowledge empowers public health authorities to proactively plan and implement targeted interventions, mitigating the impact of disease outbreaks and optimizing healthcare delivery.

Figure 1. Work flowchart. The flowchart was modified from our previous study on the same dataset [31]. The data inclusion strategy is shown in the flowchart. Out of 5594 hospitalized cases, 5371 are confirmed cases with COVID-19. After data preprocessing, 25 variables (24 independent and one of three dependent variables: ICU with MV, ICU, or IMCU) are included in the study.

Figure 2. (a) Performance matrices show that the models' performances in the test dataset yield an F1-score of 0.89 for MV, 0.87 for ICU admission, and 0.75 for IMCU. (b) Confusion matrices demonstrate that the models correctly classify 972 instances for MV, 942 instances for ICU, and 842 instances for IMCU.

Figure 3. ROC analysis evaluating the models' performances in predicting MV, ICU, and IMCU for the training and test cohorts. (a) ROC curves for the training dataset. (b) ROC curves for the test data, where the prediction of ICU achieves the best performance. It should be noted that the false-positive rate (FP) is 1 minus the specificity (the true-negative rate, TN), which means that the closer FP is to 0, the higher the specificity. Therefore, the optimal point lies in the top-left corner of the ROC plot, where both specificity and sensitivity are highest.

Figure 4. Feature interpretation. (a) Feature importance using MDI in MV; the most important features are depicted in the higher bars. (b) Features ranked (Y-axis) based on the decrease in accuracy (X-axis) under the effect of permutations. The 'bee swarm plot' (c) depicts the graphical representation of feature values based on dummy coding (Table 1), where red indicates higher values and blue indicates lower values. Each dot is representative of a patient and highlights which feature values correspond to SHAP values (X-axis). (c) The most important feature is diarrhea, followed by age, diabetes, BMI, and hypertension, the top 5 most important features in ranking order for predicting MV use. The 'Waterfall plot' (d) indicates that diarrhea results in a composition ratio of >25% (X-axis, top). Furthermore, the top 13 features accumulate 90% of the model's cumulative ratio (X-axis, bottom).

Figure 5. Feature interpretation. (a) Feature importance using MDI in ICU; the most important features are depicted in the higher bars. (b) Features ranked (Y-axis) based on the decrease in accuracy (X-axis) under the effect of permutations. The 'bee swarm plot' (c) shows that the most important feature is diarrhea, followed by age, diabetes, BMI, and sex, the top 5 most important features in ranking order for predicting ICU. The 'Waterfall plot' (d) indicates that diarrhea results in a composition ratio of >25% (X-axis, top). Furthermore, the top 12 features accumulate 90% of the model's cumulative ratio (X-axis, bottom).

Figure 6. Feature interpretation. (a) Feature importance using MDI in IMCU; the most important features are depicted in the higher bars. (b) Features ranked (Y-axis) based on the decrease in accuracy (X-axis) under the effect of permutations. The 'bee swarm plot' (c) shows that the most important feature is BMI, followed by age, diarrhea, hypertension, and sex, the top 5 most important features in ranking order. The 'Waterfall plot' (d) indicates that BMI results in a composition ratio of ~20% (X-axis, top). Furthermore, the top 13 features accumulate 90% of the model's cumulative ratio (X-axis, bottom).

Figure 7. SHAP dependence plot. (a,b) Interactions between pneumonia-diarrhea and diarrhea-diabetes have a higher impact on predicting MV. An increase in the interaction between diarrhea (red dots) and pneumonia (scattered dots on the right) impacts the model's output (higher SHAP values). (a) Higher density of red dots in the presence of pneumonia (indicated as categorical values on the X-axis), resulting in an increased likelihood of MV (shown as greater SHAP values) when both variables are present. (c,d) Interactions between age-diarrhea and diabetes-diarrhea have a higher impact on predicting ICU admissions. In (c), for younger and middle-aged adults, diarrhea shows a sharp rise in prediction (higher SHAP values with red dots); however, diarrhea does not affect the older population much (less density of red dots). The X-axis of (c) represents the three different age categories ('younger', 'middle', and 'older' adults). (e,f) Interactions between pneumonia-diarrhea and age-hypertension have higher impacts on predicting IMCU admissions.

Table 1. Parameters and characteristics.

Table 2. Significant features across patients' likelihood to require MV.
Individual chi-square results of the 24 demographic features predicting MV requirement.

Table 3. Significant features for patients' likelihood of being admitted to the ICU.
Individual chi-square results of the 24 demographic features predicting ICU admission.

Table 4. Significant features for patients' likelihood of being admitted to IMCU.
Individual chi-square results of the 24 demographic features predicting IMCU admission.

Table 5. Backward binary logistic regression for predicting MV requirement.

Table 6. Backward binary logistic regression for predicting ICU admission.

Table 7. Backward binary logistic regression for predicting IMCU admission.

Table 8. Backward binary logistic regressions cross-validated by demographic groups for predicting MV requirement.

Table 8. Cont. Only variables that were identified as key predictors are reported for each model. Unique variance; * p < 0.05; ** p < 0.01. '--' in the table indicates the features not retained by the model for the sub-class.

Table 9. Backward binary logistic regressions cross-validated by demographic groups for predicting ICU admissions.

Table 10. Backward binary logistic regressions cross-validated by demographic groups for predicting IMCU admissions.